This page is a quick reference checkpoint for CUME_DIST in Spark SQL: behavior, syntax rules, edge cases, and a minimal example; plus the official vendor documentation.
CUME_DIST returns the cumulative distribution of the current row within its window partition: the fraction of rows whose ORDER BY value is less than or equal to the current row's value, i.e. count(rows ≤ current) / count(rows in partition). Peer rows (ties under the window's ORDER BY) all receive the same value, and the result always lies in the range (0, 1].
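As a minimal sketch of that formula (hypothetical data, a single already-sorted partition):

```python
# CUME_DIST semantics on one partition: for each row, count the rows
# whose value is <= the current value, divided by the partition size.
amounts = [10, 20, 20, 30]  # hypothetical amounts, sorted by ORDER BY

def cume_dist(values):
    n = len(values)
    # Ties (peer rows) naturally share the same result.
    return [sum(1 for v in values if v <= x) / n for x in values]

print(cume_dist(amounts))  # [0.25, 0.75, 0.75, 1.0]
```

Note that both rows with amount 20 get 0.75: peers count each other, which is the edge case that most often surprises people.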
If this behavior feels unintuitive, the tutorial below explains the underlying pattern step-by-step.
`CUME_DIST()` takes no arguments and must be used with an OVER clause.
```sql
SELECT
  category,
  amount,
  CUME_DIST() OVER (PARTITION BY category ORDER BY amount) AS cume_dist
FROM sales;
```
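If you want to sanity-check the query locally without a Spark session, SQLite (3.25+) implements CUME_DIST with the same SQL-standard semantics. A sketch with a hypothetical `sales` table:

```python
import sqlite3

# In-memory stand-in for the sales table from the example above.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (category TEXT, amount REAL)")
con.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("a", 10), ("a", 20), ("a", 20), ("a", 30), ("b", 5), ("b", 15)],
)

rows = con.execute(
    """
    SELECT category, amount,
           CUME_DIST() OVER (PARTITION BY category ORDER BY amount) AS cd
    FROM sales
    ORDER BY category, amount
    """
).fetchall()

for category, amount, cd in rows:
    print(category, amount, cd)
```

Partition "a" yields 0.25, 0.75, 0.75, 1.0 (the tied 20s are peers) and partition "b" yields 0.5, 1.0 — each partition's distribution restarts at its own ORDER BY.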
If you came here to confirm syntax, you’re done. If you came here to get better at window functions, choose your next step.
CUME_DIST is part of a bigger window-function pattern. If you want the “why”, start here: Percentile Distribution
Reading docs is useful. Writing the query correctly under pressure is the skill.
For the authoritative spec, use the vendor docs. This page is the fast “sanity check”.